Assessing anaerobic speed reserve: A systematic review on the validity and reliability of methods to determine maximal aerobic speed and maximal sprinting speed in running-based sports

Purpose Locomotor profiling using anaerobic speed reserve (ASR) enables insights into athletes’ physiological and neuromuscular contributing factors and prescription of high-intensity training beyond maximal aerobic speed (MAS). This systematic review aimed to determine the validity and reliability of different methods to assess the characteristics of ASR, i.e., MAS and maximal sprinting speed (MSS). Methods A comprehensive search of the PubMed and Web of Science databases was conducted according to the PRISMA guidelines. Studies were included if they reported data on validity and/or reliability for methods to assess MAS or MSS. Results 58 studies were included with 28 studies referring to MAS and 30 studies to MSS. Regarding MAS, different methods for cardiopulmonary exercise testing yielded different values (four out of seven studies) of MAS (Cohen’s d (ES) = 0.83–2.8; Pearson’s r/intraclass correlation coefficient (r/ICC) = 0.46–0.85). Criterion validity of different field tests showed heterogeneous results (ES = 0–3.57; r/ICC = 0.40–0.96). Intraday and interday reliability was mostly acceptable for the investigated methods (ICC/r>0.76; CV<16.9%). Regarding MSS, radar and laser measurements (one out of one studies), timing gates (two out of two studies), and video analysis showed mostly good criterion validity (two out of two studies) (ES = 0.02–0.53; r/ICC = 0.93–0.98) and reliability (r/ICC>0.83; CV<2.43%). Criterion validity (ES = 0.02–7.11) and reliability (r/ICC = 0.14–0.97; CV = 0.7–9.77%) for global or local positioning systems (seven out of nine studies) and treadmill sprinting (one out of one studies) was not acceptable in most studies. Conclusion The criterion validity of incremental field tests or shuttle runs to examine MAS cannot be confirmed. Results on time trials indicate that distances adapted to the participants’ sporting background, fitness, or sex might be suitable to estimate MAS. 
Regarding MSS, only sprints with radar or laser measures, timing gates, or video analysis provide valid and reliable results for linear sprints of 20 to 70 m.


Introduction
Assessing athletic performance is common in sports science and sports practice, e.g., to individualize training procedures. In a variety of team and individual sports and at different performance levels, endurance testing is a required and inevitable part of the training routine, more specifically to set training intensities [1].
Most currently used markers for endurance testing (e.g., lactate thresholds, maximal oxygen uptake (VO2max), critical speed) are limited to aerobic performance measures and are therefore not applicable for the prescription of training in more intense exercise domains, e.g., intensities above VO2max [2]. For this purpose, the anaerobic speed reserve (ASR), defined as the difference between maximal sprinting speed (MSS) and maximal aerobic speed (MAS), can be used to profile athletes in running-type sports on a physiological (referring to MAS) and neuromuscular (referring to MSS) basis and to prescribe training intensities [1,3]. By using proportions of ASR, exercise intensity beyond MAS can be set by normalizing absolute values of MAS and MSS, which allows coaches or researchers to consider an athlete's individual tolerance to high-intensity exercise [4]. Furthermore, ASR can be used to identify athletes with similar or different characteristics. For example, most team sports require more focus on MSS and thus a higher ASR compared to longer-distance events in endurance sports, which mostly require a higher MAS and thus a lower ASR [1]. However, within the same discipline, e.g., in track and field, ASR can differentiate between elite and sub-elite athletes, as indicated by a strong relationship of ASR with 800-m running performance in world-class middle-distance runners [3].
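As a minimal sketch of the ASR concept described above, the following Python snippet computes ASR as MSS minus MAS and derives a target speed from a proportion of ASR. The function names, the athlete values, and the convention that intensity equals MAS plus a fraction of ASR are illustrative assumptions and are not taken from the reviewed studies.

```python
def anaerobic_speed_reserve(mss_kmh: float, mas_kmh: float) -> float:
    """Return ASR as the difference between MSS and MAS (same speed unit)."""
    if mss_kmh <= mas_kmh:
        raise ValueError("MSS is expected to exceed MAS")
    return mss_kmh - mas_kmh


def speed_at_asr_fraction(mss_kmh: float, mas_kmh: float, fraction: float) -> float:
    """Target speed at a given fraction (0..1) of ASR above MAS.

    Assumption: intensity is expressed as MAS + fraction * ASR.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return mas_kmh + fraction * anaerobic_speed_reserve(mss_kmh, mas_kmh)


# Hypothetical athletes with the same MAS but different MSS
for name, mss, mas in [("A", 32.0, 18.0), ("B", 28.0, 18.0)]:
    asr = anaerobic_speed_reserve(mss, mas)
    target = speed_at_asr_fraction(mss, mas, 0.30)
    print(f"Athlete {name}: ASR = {asr:.1f} km/h, 30% ASR = {target:.1f} km/h")
```

With identical MAS, the two hypothetical athletes receive different absolute target speeds at the same proportion of ASR, which is the normalization effect the text refers to.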
To assess ASR, i.e., MAS and MSS, different methods are used that can lead to different results and thus, e.g., to different training prescriptions and subsequently different adaptations to training [5]. Cardiopulmonary exercise testing (CPET), including breathing gas analysis on a treadmill while employing an incremental protocol, is considered the gold standard methodology to assess MAS [6,7]. MAS can be defined as the first speed associated with VO2max [8,9], yet there is ongoing debate about the exact procedure to define this speed. Di Prampero et al. [8] first defined MAS as a calculated speed based on the ratio of the maximal fraction of VO2max and the energy cost of running, intended to describe the speed that a runner can sustain under aerobic conditions. Subsequently, further definitions of MAS emerged, for example the speed at the onset of the VO2 plateau and therefore the maximal speed at which mainly aerobic resources are used [10,11], the first speed associated with the 30-s interval of VO2max [12,13], or the maximal speed reached during the incremental treadmill test [14]. Although different definitions of MAS are used even for CPET as the gold standard method, several field tests are currently implemented to estimate MAS. These range from incremental continuous field tests like the Université de Montréal Track Test (UMTT) or incremental shuttle runs to (set-distance) time trials. Consequently, for MAS it is currently unclear which testing profile and which definition of the first speed at which VO2max is reached is the most valid and reliable.
Regarding MSS, linear sprint tests of 20-50 m with radar or laser measurement are mostly used as gold standard methods [15-18]. However, timing gates with 5- or 10-m split times or Global and Local Positioning Systems (GPS, LPS) are also very common methods. Moreover, GPS and LPS are implemented during matches or training (e.g., small-sided games), mainly in soccer, to assess MSS [19].
Since the prescription of high-intensity training and the profiling of athletes based on ASR rely on valid and reliable testing of MAS and MSS to guarantee that any changes are not the result of measurement error (i.e., systematic and random error stemming from technological and biological sources) or intraindividual variance, the selection of measures needs to be carefully considered [20]. While previous reviews focused, for example, on several factors influencing sprint performance testing (such as temperature, wind, running surface, or shoes) [21] or on aerobic fitness tests to assess VO2max, often in only one type of sport [22], there is no overview of the validity and reliability of testing methods for MAS and MSS in running-based sports.
Therefore, this review aims to systematically examine the available literature on the validity and reliability of different methods to assess the two sub-components of ASR, i.e., MAS and MSS, in running-based sports, e.g., team sports, track and field, and runners at recreational and higher levels. The results of this review can be of value for practitioners and scientists in choosing the testing methodology that best meets their requirements.

Methods
This systematic review was written according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses; see PRISMA 2020 checklist in S1 Table) guidelines [23], and no protocol was registered previously.

Eligibility criteria for included studies
To be included in the systematic review, the peer-reviewed scientific publications had to meet the Patient, Intervention, Comparator, Outcome, and Study (PICOS) criteria (comparator criteria not applicable). Our search was limited to original articles published in peer-reviewed journals and written in English. References cited by the articles retrieved were also examined for potential relevance. Conference abstracts, dissertations, theses, and other non-peer-reviewed articles were excluded. Fig 1 illustrates the screening and selection process employed.
2.1.1 Patients. All articles reporting on healthy active adults related to running with no restrictions on sex were included.
2.1.2 Intervention. Studies were included if they specifically evaluated methods to assess MAS and/or MSS. Eligibility criteria for study inclusion consisted of one of the following: (i) tests performed two or more times during one occasion (intraday reliability) or on two or more separate occasions (interday reliability); (ii) tests compared against other methods (criterion and convergent validity).
2.1.3 Outcome. All outcomes reporting validity and/or reliability of methods to assess MAS and/or MSS were included. If only split times during sprint tests were specified, the speed was calculated based on the time and distance by the authors of the present systematic review.

Study design.
Any original comparative studies were included.

Search strategy
A comprehensive search strategy was designed by the authors of this article. The electronic databases PubMed and Web of Science were searched in August 2022 (with no restriction concerning publication date), with an updated search conducted in December 2022. The search strategy is illustrated using the search terms entered into the PubMed database as an example (S1 File) and was modified according to the indexing system of Web of Science.
The following keywords were used to capture validity: validity, logical, criterion, convergent, discrimination, gold standard, level, standard. The following keywords were used to capture reliability: reliability, repeatability, reproducibility, measurement error, consistency, smallest worthwhile change, minimal detectable change, typical error, usefulness, sensitivity, relationship, relation, association, correlation. The following keywords were used to capture MAS: maximal aerobic speed, maximal aerobic velocity, MAS, velocity at VO2max, velocity associated with VO2max, vVO2max. The following keywords were used to capture MSS: maximal sprinting speed, maximal speed, MSS, maximal velocity, peak speed. The following keywords were used to capture ASR: anaerobic speed reserve, anaerobic velocity reserve, ASR.

Selection of studies
The identified articles were imported into the online systematic review tool Rayyan (rayyan.ai), where duplicates were eliminated. As previously performed and suggested [23,24], one of the authors (MT) examined the titles and abstracts of all possibly pertinent papers for eligibility, with independent verification by a second author (SA). The full texts of articles that met the criteria for inclusion were then retrieved and screened. When disagreements between reviewers arose, consensus was achieved through discussion or input from a third author (PD).

Data extraction and analysis
From the selected articles, one author (MT) extracted data, which were independently confirmed by another (SA); difficulties were resolved through discussion with the other authors. Extracted information concerned: details of publication, number of participants, demographic information (including sex, age, type of sport), testing methods with a short test description, reliability and validity type, outcome measures as well as results for validity or reliability, and the information required to assess the methodological quality of each study.
If possible, the mean difference, percentage difference, and the respective effect sizes (ES as Cohen's d) between values of MAS or MSS from different methods or for different sampling time points were retrieved from the studies or calculated manually. ES (Cohen's d) was rated according to Cohen [25]: ES < 0.2 was considered a trivial effect, 0.2 ≤ ES < 0.5 a small effect, 0.5 ≤ ES < 0.8 a moderate effect, and ES ≥ 0.8 a large effect. Values for the intraclass correlation coefficient (ICC), Pearson's r, and the coefficient of variation (CV) were taken for reliability (both intraday and interday) and validity. CV is a measure of absolute reliability and validity, whereas ICC and Pearson's r are indicators of relative reliability and validity. According to Hopkins [26], the magnitude of the correlation was classified as small (0.1 ≤ r/ICC < 0.3), medium (0.3 ≤ r/ICC < 0.5), large (0.5 ≤ r/ICC < 0.7), very large (0.7 ≤ r/ICC < 0.9), or almost perfect (r/ICC ≥ 0.9). CV values were interpreted relative to each other [27]. The results were analyzed descriptively.
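The rating thresholds described above can be sketched in Python as follows. The function names are illustrative, the pooled-standard-deviation form of Cohen's d is an assumption (the text does not state which standard deviation was used for manual calculation), and the label for correlations below 0.1 is a placeholder because the text defines no category for that range.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Absolute Cohen's d using a pooled SD (assumed variant)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return abs(mean_a - mean_b) / sqrt(pooled_var)


def rate_effect_size(es):
    """Rate an effect size with the thresholds attributed to Cohen [25]."""
    es = abs(es)
    if es < 0.2:
        return "trivial"
    if es < 0.5:
        return "small"
    if es < 0.8:
        return "moderate"
    return "large"


def rate_correlation(r):
    """Rate r or ICC with the thresholds attributed to Hopkins [26]."""
    r = abs(r)
    if r < 0.1:
        return "unclassified"  # placeholder: no category given in the text
    if r < 0.3:
        return "small"
    if r < 0.5:
        return "medium"
    if r < 0.7:
        return "large"
    if r < 0.9:
        return "very large"
    return "almost perfect"


# Hypothetical comparison of MAS from two methods (km/h)
d = cohens_d(18.5, 1.2, 15, 17.9, 1.2, 15)
print(rate_effect_size(d), rate_correlation(0.93))
```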

Assessment of methodological quality
Based on the recommendation of Ma et al. [28], the methodological quality of the studies included in this review was assessed through the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist by using the boxes for reliability, measurement error, criterion validity, and convergent validity [29,30].The risk of bias was assessed independently by two of the authors (PD & MT), with any disagreements again being resolved by consensus or through discussion with a third author (LR) [31].
The score for each item was determined as follows: 3 = very good; 2 = adequate; 1 = doubtful; 0 = inadequate; and NA = not applicable. The quality of each study was rated with a worst-score-count method to determine the risk of bias [29]. Further evaluation of the methods' validity (i.e., criterion validity) and recommendations for practical application were based on the studies using an accepted gold-standard method (i.e., CPET for MAS; radar or laser for MSS) and achieving a methodological quality score of at least 2 (adequate).
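The scoring rule above can be illustrated with a short Python sketch. It assumes that the worst-score-count method takes the lowest score among the applicable items and that NA items are ignored; how items are grouped into COSMIN boxes is not modelled here, and all names are hypothetical.

```python
from typing import Optional, Sequence

SCORE_LABELS = {3: "very good", 2: "adequate", 1: "doubtful", 0: "inadequate"}


def worst_score(item_scores: Sequence[Optional[int]]) -> Optional[int]:
    """Return the lowest applicable item score; None stands for NA."""
    applicable = [s for s in item_scores if s is not None]
    if not applicable:
        return None  # every item was not applicable
    return min(applicable)


def eligible_for_recommendation(item_scores: Sequence[Optional[int]],
                                uses_gold_standard: bool) -> bool:
    """Apply both conditions from the text: gold standard and score >= 2."""
    overall = worst_score(item_scores)
    return uses_gold_standard and overall is not None and overall >= 2


# Hypothetical study: one doubtful item lowers the overall rating
scores = [3, 3, 1, None]
overall = worst_score(scores)
print(SCORE_LABELS[overall], eligible_for_recommendation(scores, True))
```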
An overview of the study characteristics, including populations' characteristics, a short description of testing methods, reliability/validity type, and main results, is given in Table 1 for MAS and in Table 2 for MSS. Additionally, Table 3 presents a descriptive overview of the participants' age, the contribution of sexes, and the sporting background.

Main findings for criterion validity and reliability
Incremental field tests. The validity of incremental continuous field tests (UMTT, VAM-EVAL, track-individualized short ramp graded test, or National University of Catamarca test) to estimate MAS was tested 12 times against CPET on a treadmill as the reference method. The mean difference between the estimated MAS and the MAS retrieved by CPET ranged from -2.0 to 1.61 km/h (ES = 0-2.03; r/ICC = 0.83-0.96). Intraday or interday reliability was not reported for incremental continuous field tests.
Time trials. The validity of time trials (1500 m, 3200 m, or 5 min) to estimate MAS was examined nine times. The mean difference from the gold standard, i.e., CPET, ranged from 0.1 to 2.06 km/h (ES = 0.08-1.75; r/ICC = 0.51-0.94). The interday reliability of time trials was reported as good in one study with a mean difference of 0.22 km/h (ES = 0.14-0.17; ICC = 0.88-0.95; CV = 2.0-3.3%).
Shuttle runs. Results from different shuttle runs (futsal-specific shuttle test, 30-15 IFT, Carminatti's shuttle test, 20-m shuttle test) to estimate MAS were compared to CPET four times for validity testing. The mean differences ranged from -0.7 to 4.0 km/h (ES = 0-2.04) with correlations between 0.55 and 0.93. Regarding interday reliability, one study reported results for the 30-15 IFT with a mean difference of 0.3 km/h (ES = 0.29; ICC = 0.91; CV = 1.8%). Intraday reliability from one study for the 3-min 30-second Endurance Capacity Test showed a mean difference of -0.2 km/h (ES = 0.11; ICC = 0.98; CV = 2.0%).
Other/individualized methods. In two studies, the validity of other fitness tests to estimate MAS was investigated. The comparison of the self-paced submaximal running test, in which the velocities of the stages were self-chosen according to ratings of perceived exertion (RPE) of 10, 13, and 17, with CPET showed mean differences between -0.27 and -4.43 km/h (ES = 0.20-3.57; r/ICC = 0.50-0.66). Interday reliability was reported for each test (n = 2). The mean differences ranged from -0.32 to 0.20 km/h (ES = 0.03-0.28; r/ICC = 0.76-0.99; CV = 3.9-16.9%). Intraday reliability was not examined.

Maximal sprinting speed.
Reference methods used for validity testing. As a gold standard method to assess MSS, 20-100-m linear sprints with radar or laser measurement were used in 11 studies investigating validity. Sprints of 30-40 m with 5- or 10-m split times and timing gates were used in eight studies, a 40-m sprint with GPS measurement was used in four studies, 25-26.5-m sprints with a motion capture system were used in two studies, a 45.72-m sprint with video analysis via a smartphone app was used in one study, and a 40-m sprint with a linear encoder was also used in one study as a reference method to assess MSS. In one study, data were collected from online sources and the reference MSS was modelled mono-exponentially based on the data.
Radar/Laser technology. In one study, the validity of radar and laser measurement was tested with a linear encoder as the reference method. The results differed by between -0.11 and -0.07 km/h (ES = 0.02-0.03). Intraday reliability was reported four times, with mean differences ranging from 0.04 to 0.29 km/h (ES = 0.02-0.08; ICC > 0.95; CV < 1.8%). Interday reliability was reported two times for radar/laser measurement with a mean difference of -0.18 km/h (ES = 0.09; ICC = 0.83-0.98; CV < 2.4%).
Video analysis via smartphone applications. The validity of video analysis with a smartphone app was examined two times against radar/laser measurement. The results differed by between 0.25 and 1.12 km/h (ES = 0.07-0.53; r/ICC = 0.92-0.99). Intraday reliability was determined two times with ICC > 0.97 and CV < 1.6%. Interday reliability was not examined.
Treadmill sprinting. The validity of a sprint on a motorized treadmill with piezoelectric force transducers, with radar on a running track as the reference, was determined one time. The results showed a mean difference of -6.98 km/h (ES = 4.27; r = 0.89). Interday and intraday reliability for the non-motorized treadmill was examined one time each, with a CV of 1.8 and 1.2%, respectively.

Overview
The aim was to systematically review the scientific literature regarding the validity and reliability of different methods to assess ASR, i.e., methods to assess MAS and MSS. The high number of studies and methods included in this review emphasizes the popularity of both MAS and MSS in research and practice. As a combination of MAS and MSS, ASR has the advantage of normalizing absolute values for individualizing high-intensity training above MAS to take individual tolerances to high-intensity efforts into account [1,4].
Regarding MAS, CPET on a treadmill is mostly used as a gold standard method (21 out of 28 studies; criterion validity), but with different protocols or definitions of MAS. The most studied methods could be assigned to CPET (n = 19) and time trials (n = 20), followed by incremental continuous field tests such as the UMTT (n = 12). The least studied methods were shuttle runs (n = 10). Intraday (n = 1) and interday reliability (n = 3) of methods to assess MAS were both investigated very rarely.
With respect to MSS, radar measurements (11 out of 28 studies; criterion validity) and distances between 20 and 40 m (23 out of 28 studies) are mostly used as gold standard methods for validity testing. Validity was assessed most often for GPS/LPS measurements (n = 18), followed by timing gates including different split distances (n = 8) and video analysis (n = 2). Radar and laser (n = 2) were validated against a linear motorized encoder. Intraday reliability (n = 22) was investigated markedly more frequently than interday reliability (n = 2).

Maximal aerobic speed
4.2.1 Cardiopulmonary exercise testing.
Although CPET on a treadmill is considered the gold standard method to assess MAS, the testing protocols and definitions, i.e., the methods used to derive MAS from CPET, are yet to be clarified. The most used protocols are (square-wave) incremental protocols with and without rest between the stages, with different starting velocities, increments, or stage durations, or protocols based on individual performance, e.g., a starting speed related to theoretical maximal heart rate [6, 7, 10, 12-14, 32, 34-37, 39, 42-46, 48-50, 52]. The definitions of MAS used in the studies also differed, e.g., the first speed at which VO2max occurred with no further specification [6,39,45,46,48,49,52], a speed calculated based on VO2 kinetics [7,32,34,35,43,44,50], the first speed of the 30-s interval of VO2max [7,12,13,50], the speed at the onset of the VO2 plateau [10,42], or the final speed reached during CPET [14,35-37].
Regarding the validity of CPET on a treadmill to determine MAS, different incremental continuous protocols yielded similar results when the increments and stage durations were multiples of each other (e.g., 1 km/h increment and 2 min duration of stages versus 0.5 km/h increment and 1 min duration of stages) [6]. For the same increments, longer stage durations led to lower values of MAS (-2.1±0.5 to -2.0±1.45 km/h) [7,50]. MAS shows significantly higher values when using square-wave protocols (with resting periods) than when examined with incremental continuous protocols [7,50]. Because different CPET protocols yield different MAS results, no conclusion can be drawn from the included studies. However, taking the definition of MAS into account, i.e., the first speed associated with VO2max, it is crucial for the subjects to actually reach VO2max during the treadmill test. This can be verified by the incidence of a VO2 plateau or by other criteria such as blood lactate, respiratory exchange ratio, heart rate, or the rating of perceived exertion [81]. Regarding the protocol, continuous incremental exercises with durations between 8 and 16 min (e.g., starting at 4 or 6 km/h for recreational subjects and at 8 or 10 km/h for trained subjects with 1 km/h increments every minute) are suggested to reach VO2max and might therefore also be suitable for assessing MAS [81,82]. Additionally, the duration of stages should potentially be chosen differently depending on the background of the athletes. For example, athletes accustomed to prolonged running (e.g., long-distance runners) might benefit from longer stages (i.e., 2 or 3 min), whereas athletes accustomed to shorter or more intense running (e.g., some team sports athletes or 400-m runners) would suffer from neuromuscular exhaustion during longer stages and therefore benefit from shorter stages (i.e., 1 min).
When MAS was determined at the onset of the plateau of VO2max, i.e., VO2 reaching a steady state during an incremental treadmill test, the results differed markedly from MAS determined at the 30-s interval of VO2max [10]. These discrepancies might be explained by the fact that a leveling-off of VO2 is defined as lasting at least 1 min [83], whereas the 30-s interval of VO2max almost always occurs at the end of an incremental exercise close to the maximal speed [10,11]. Further results regarding comparisons of definitions of MAS indicate that the final speed reached during incremental exercise, as well as velocities based on calculations or extrapolations of VO2, are associated with a higher contribution of anaerobic resources than the first speed at the steady state of VO2, i.e., the speed at the onset of the plateau of VO2max and therefore the first speed associated with VO2max when mainly aerobic resources are used [35,39]. Such considerations should be taken into account when assessing MAS with CPET to determine the first speed associated with VO2max at which energy production is still largely aerobic [10,11]. Especially for the purpose of HIIT prescription based on MAS or ASR, using definitions such as the speed of the 30-s interval of VO2max or the final velocity reached during CPET instead of the velocity at the onset of the VO2 plateau could lead to adaptations other than those intended or even to overtraining.
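As a rough planning aid for the protocol suggestion in this subsection (continuous incremental tests lasting between 8 and 16 min), the following Python sketch lists the stage speeds of an incremental protocol and checks the expected duration. It assumes that the test ends when the stage at a previously estimated MAS is completed; the expected MAS values and all function names are hypothetical.

```python
import math


def stage_speeds(start_kmh, expected_mas_kmh, increment_kmh=1.0):
    """Speeds (km/h) of all stages from the start speed up to the expected MAS."""
    if increment_kmh <= 0 or expected_mas_kmh < start_kmh:
        raise ValueError("need a positive increment and expected MAS >= start speed")
    # Small epsilon guards against floating-point error in the division.
    n_stages = math.floor((expected_mas_kmh - start_kmh) / increment_kmh + 1e-9) + 1
    return [start_kmh + i * increment_kmh for i in range(n_stages)]


def protocol_duration_min(start_kmh, expected_mas_kmh,
                          increment_kmh=1.0, stage_min=1.0):
    """Expected test duration if every stage up to the expected MAS is completed."""
    return len(stage_speeds(start_kmh, expected_mas_kmh, increment_kmh)) * stage_min


def within_suggested_window(duration_min, lower=8.0, upper=16.0):
    """Check the duration against the 8 to 16 min window cited in the text."""
    return lower <= duration_min <= upper


# Hypothetical trained runner: start at 10 km/h, expected MAS about 19 km/h
duration = protocol_duration_min(10.0, 19.0)
print(duration, within_suggested_window(duration))
```

Under these assumptions, doubling the stage duration without adjusting the start speed or increment can push the test outside the suggested window, which is one practical reason to adapt the protocol to the athlete.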
The one study addressing interday reliability for MAS determined by CPET reported good relative reliability.However, since different protocols on the treadmill and different definitions of MAS are currently used and partly differ regarding their results, further studies should address validity and especially reliability of a consistent definition of MAS.

Incremental continuous field tests.
Léger et al. [84] developed and validated an incremental continuous field test, the UMTT, to indirectly determine VO2max. The UMTT and variations of it (e.g., VAM-EVAL) were subsequently used to estimate MAS, as they are less time- and material-consuming and require less expertise to implement than CPET [44,85]. Regarding the validity of these field tests, the results depend on the protocols and definitions used in the reference measure and in the field test. In all of the studies (n = 8) investigating the validity of incremental continuous field tests to estimate MAS, CPET was used as the reference measure (criterion validity). When identical protocols were implemented in the CPET and the field tests, two studies reported good validity [45,48]. However, one study reported significant differences [52]. Lopes et al. [45] and Pallarés et al. [48] validated the UMTT, whereas Cappa et al. [52] developed a new field test (i.e., the University of Catamarca test) and validated it against CPET on a treadmill with the same protocol. The University of Catamarca test was run around a hexagon with 20-m sides, so that the participants ran around corners rather than on a track with straights and curves as in the UMTT, potentially leading to earlier exhaustion in the Catamarca test [52]. Comparisons of the final speed reached in the UMTT with the MAS calculated according to Lacour et al. [44] showed good validity [35,44], whereas heterogeneous results were found when the maximal speed during CPET [35,36] or the speed associated with VO2max [6,45,48] was used as the definition of MAS during CPET. The final velocity reached during the UMTT resulted in higher values (+1.61 km/h) than the speed at the onset of the VO2 plateau (as assessed with CPET) [10]. Therefore, using the final velocity of the UMTT to estimate MAS and prescribe training, e.g., HIIT, might lead to higher intensities than intended, which can be detrimental in terms of maladaptation or even overtraining. Although no study on reliability could be included, Léger et al. [84] reported good reliability of the maximal metabolic equivalents related to the UMTT.

Time trials.
Set-distance time trials with different distances or 5-min time trials are often used in research and practice to estimate MAS because they are easy to implement even for larger groups [3,33]. While the average speed during 5-min time trials was similar to the final speed reached during CPET or the UMTT in men of different fitness levels or amateur soccer players [36,37,40], the results on the validity of set-distance time trials are heterogeneous. When 1500- or 1610-m time trials were compared to CPET (criterion validity), all of the studies reported an overestimation of MAS across different participants, such as male and female runners, male Australian rules football players, or trained male soccer players, and with different definitions and protocols used in the CPET [10,12,43,44,46]. Bellenger et al. [33] and Lundquist et al. [47] investigated set-distance time trials between 1200 and 2200 m with male and female Australian rules football players. However, they used the final speed reached in the UMTT as the reference measure (convergent validity). For female subjects, the smallest effect and strongest relationship were found between the 1400-m time trial and the UMTT [47], whereas for male subjects the smallest effect was found for the 2000-m time trial [33]. These results indicate that the validity of set-distance time trials depends on which distance is used for which sex, type of sport, or fitness level. For this reason, Bellenger et al. [33] suggested predicting MAS from the running speed for any time trial distance between 1200 and 2200 m with the equation: MAS = average speed × (0.766 + 0.117 × distance). However, the validity of this equation could not be confirmed for a 1500-m time trial in trained male soccer players [10]. When the studies investigating criterion validity of time trials with an accepted gold standard, i.e., CPET, are considered (n = 6), the results indicate that the time or distance should be selected according to the background of the subject (sex, type of sport, fitness). For subjects with lower fitness or those from sports based on higher speeds and mainly anaerobic energy supply, e.g., most team sport players or sprinters, shorter distances might be favorable (e.g., 1200-1400 m), whereas for (endurance-)trained athletes longer distances (e.g., 1500-2000 m) could yield an average speed more similar to MAS [10,36,37,43,44,46].
4.2.4 Shuttle runs. Shuttle runs are defined by stages with increasing speed or distance and changes of direction, often by 180°. Most shuttle runs include active or passive rest [11]. While the peak speed of shuttle runs with passive rest (futsal-specific shuttle run; 30-15 IFT) overestimated MAS assessed during CPET [14,32], shuttle runs with active or no rest (Carminatti's test, 20-m shuttle run) yielded results more similar to CPET (criterion validity) [42,49]. The passive rest could have provided an opportunity to recover in between the shuttles so that a higher final speed was reached [11,14]. The results for the 30-15 IFT and the 3-min 30-s endurance capacity test showed good interday and intraday reliability, respectively [14,53]. When only studies using an accepted gold standard to assess MAS, i.e., CPET (n = 4; criterion validity), are considered, most of the studies (two out of three) determining final speed during a shuttle run could not confirm its validity (large overestimation by shuttle runs) [14,32]. These results might be mainly due to shuttle runs demanding more anaerobic resources and a higher energy cost of running because of change-of-direction movements [11]. These results indicate that MAS cannot be determined from the final speed reached during a shuttle run.
4.2.5 Other/Individualized tests. Sangan et al. [13] validated a field test that is based on individual RPE during three stages (RPE 10, 13, and 17) lasting three minutes each. The comparisons of the average velocities during the last minutes of the three stages with MAS retrieved by CPET showed poor criterion validity, while absolute and relative intraday reliability can be rated as good. While the validity of the three-level tests needs to be investigated further with CPET, the validity of a field test based on RPE cannot be confirmed [13,51].
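The time-trial equation attributed to Bellenger et al. [33] in the time trials subsection can be sketched in Python as follows. The text does not state the unit of the distance term; this sketch assumes kilometres, because the correction factor then equals 1.0 at 2000 m, which is consistent with the 2000-m time trial showing the smallest effect in male subjects. The function name and the example values are hypothetical.

```python
def predict_mas_from_time_trial(average_speed: float, distance_m: float) -> float:
    """Estimate MAS from a set-distance time trial.

    Implements MAS = average speed * (0.766 + 0.117 * distance), with the
    distance converted to kilometres (an assumption, see lead-in). The result
    is in the same unit as average_speed.
    """
    if not 1200 <= distance_m <= 2200:
        raise ValueError("equation is described for distances of 1200-2200 m")
    distance_km = distance_m / 1000.0
    return average_speed * (0.766 + 0.117 * distance_km)


# Hypothetical athlete: 1500 m run at an average of 18.0 km/h
print(round(predict_mas_from_time_trial(18.0, 1500), 2))
```

Under this assumption, average speeds from distances shorter than 2000 m are scaled down and those from longer distances are scaled up slightly, which matches the reported tendency of shorter time trials to overestimate MAS.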

Maximal sprinting speed
4.3.1 Radar/Laser technology.
As MSS reflects the highest sprinting speed a subject can reach, it marks the upper limit of the ASR by providing information about the maximal physiological, mechanical, and coordinative output during sprinting [1]. Measuring the speed profile during linear sprinting with laser or radar devices is considered the gold standard method to examine MSS [18,76]. One study examined and confirmed convergent validity of both radar and laser measurement during 40-m sprints to assess MSS with a linear motorized encoder as the reference measure [70]. The validation of accepted gold standard methods becomes challenging in the absence of alternative gold standards. However, previous studies have demonstrated the accuracy of laser and radar devices in assessing various sprinting characteristics [21]. Both absolute and relative intraday reliability were reported as good in all studies (n = 4) [17,70,71,79], as were absolute and relative interday reliability (n = 2) [17,71].
4.3.2 Timing gates. Timing gates are commonly used to assess sprinting characteristics as they conveniently provide sprinting times for distinct distances [27,79]. When the average speed assessed during 5- and 10-m splits was compared to MSS measured with radar or a linear motorized encoder [62,70,79], the results indicated that 10-m splits, and not only 5-m splits, are sufficient to assess MSS with timing gates. Moreover, Zabaloy et al. [79] did not detect significant differences between the 20-25-m and 25-30-m splits, indicating that participants reached MSS between 20 and 30 m. With respect to the total distances for the assessment of MSS, distances between 20 and 100 m were used in the included studies [60,69]. Although no study compared different total distances, comparisons of the split times indicate that distances between 20 and 40 m should suffice for team sport athletes to reach MSS [57,63,64,80]. Conversely, studies investigating trained track and field sprinters revealed that MSS was reached between 40 and 70 m [86,87]. Regarding the intraday reliability of timing gates during 30-50-m linear sprints, good absolute and relative reliability was reported [65,70,79,80]. Although no studies on the interday reliability of assessing MSS with timing gates could be included in this review, studies investigating other sprinting characteristics (e.g., acceleration) confirmed the interday reliability of timing gates. Therefore, it can be assumed that interday reliability would be acceptable for MSS as well [27,88]. In sum, timing gates with 5- or 10-m split times seem to be a valid and reliable method to assess MSS during linear sprints between 20 and 40 m for team sports, with indications that elite sprinters might need longer distances to reach MSS [57,62-65,70,79,80,86]. However, no results regarding total distances and split times for the examination of MSS with timing gates are available for recreational subjects, who might need different distances due to their lack of expertise and lower fitness.
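The split-based approach described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the gate spacing, and the gate times are hypothetical and not taken from any included study. It computes the average speed for each split between consecutive gates and takes the fastest split as the MSS estimate.

```python
def split_speeds(gate_positions_m, gate_times_s):
    """Return the average speed (m/s) of each split between consecutive gates."""
    if len(gate_positions_m) != len(gate_times_s):
        raise ValueError("positions and times must have the same length")
    speeds = []
    for i in range(1, len(gate_positions_m)):
        distance = gate_positions_m[i] - gate_positions_m[i - 1]
        duration = gate_times_s[i] - gate_times_s[i - 1]
        if distance <= 0 or duration <= 0:
            raise ValueError("gates must be ordered in space and time")
        speeds.append(distance / duration)
    return speeds


def estimate_mss(gate_positions_m, gate_times_s):
    """Return (fastest split speed in m/s, (start, end) of that split in m)."""
    speeds = split_speeds(gate_positions_m, gate_times_s)
    fastest = max(range(len(speeds)), key=lambda i: speeds[i])
    return speeds[fastest], (gate_positions_m[fastest], gate_positions_m[fastest + 1])


# Hypothetical 30-m sprint with gates every 5 m (illustrative values only).
positions_m = [0, 5, 10, 15, 20, 25, 30]
times_s = [0.00, 1.10, 1.85, 2.50, 3.10, 3.67, 4.25]

mss_ms, mss_split = estimate_mss(positions_m, times_s)
mss_kmh = mss_ms * 3.6  # convert m/s to km/h
print(f"Estimated MSS: {mss_ms:.2f} m/s ({mss_kmh:.1f} km/h) in split {mss_split}")
```

Because each value is an average over the split, shorter splits come closer to the instantaneous peak speed but are more sensitive to timing error.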

Global and local positioning systems.
GPS and LPS have recently become more accessible and affordable for assessing running or sprinting characteristics, e.g., MSS. They provide live speed and distance data from multiple subjects simultaneously during testing sessions as well as during training or matches [59]. Regarding the validity of GPS measures during linear sprinting to assess MSS, some of the studies using radar as the reference method (n = 6) reported good criterion validity for 10 Hz GPS (three out of four) [18,62,77]. However, Akyildiz et al. [66] found rather poor criterion validity for 10 Hz GPS with comparable samples (i.e., male amateur soccer players) and sprinting distances (i.e., 40 m). In addition, 5 Hz and 16 Hz GPS also yielded different results for MSS compared to radar (criterion validity) [15,75]. LPS sampling at 20-45 Hz showed rather poor validity when compared to motion capture [58,61], indicating that LPS measures are not yet applicable for the assessment of MSS. These results seem surprising, as LPS was considered a valid technology to determine other running characteristics, though Pino-Ortega et al. [89] point out that the validity of LPS is highly dependent on the data processing. Most of the studies (seven out of 10) reported good intraday reliability for MSS assessed with 1-20 Hz GPS/LPS [67,69,70,73,74,77,78], and the one study that assessed interday reliability of 10 Hz GPS showed good absolute reliability [18]. To conclude, most studies using an accepted gold standard method to assess MSS, i.e., radar technology, could not confirm the criterion validity of GPS or showed inadequate methodological quality (four out of six studies) [15,62,66,75]. LPS measurements showed poor criterion and convergent validity in all studies relating to motion capture (n = 2) [58,61]. Interday (one out of one study) and intraday reliability (seven out of 10 studies) was reported as good for GPS and LPS in most studies [18,67,69,70,73,74,77,78].

Video analysis via smartphone applications.
A relatively cheap and easily accessible method to assess sprint kinematics is video analysis via smartphone applications that record time and distance data to calculate, for example, velocities. Two studies validated a smartphone app (MySprint App) against laser or radar measurements. Results regarding MSS showed good criterion validity in male and female team sport athletes and in male trained sprinters as well as good intraday reliability [71,76]. The results of these two studies indicate that this method can be used validly and reliably to assess MSS; however, given the small number of studies, further investigations are crucial.
4.3.5 Treadmill sprinting. Since MSS is usually measured during linear sprinting on an (outdoor) track, several environmental conditions can influence the outcome measures. Therefore, standardizing the assessment by using treadmills in the laboratory was previously suggested [72]. However, the studies examining (non-)motorized treadmills to assess MSS showed poor convergent and criterion validity (large ES), as running on a treadmill yielded much lower MSS than overground running [60,72]. This difference is likely a result of the participants having to overcome the high inherent resistance of the treadmill. Nevertheless, intraday and interday reliability was reported as good [72].

Strengths and limitations
4.4.1 Strengths. The results of this review should be interpreted in light of its limitations and strengths as well as the limitations and strengths of the included studies. A strength of this article is the large number of studies (n = 58) included in the analysis. In addition, the studies' participants came from different backgrounds, allowing conclusions for several fitness levels and settings for adults. A major methodological strength of this review is the application of the COSMIN checklist, which is especially suited to evaluating the risk of bias of validity and reliability studies [29,30].

Limitations.
A limitation is the unequal number of studies that could be included for the investigated methods, e.g., only few studies on shuttle runs to estimate MAS or on video analysis to assess MSS. This issue needs to be addressed in the future so that clearer conclusions can be drawn. The included studies can be criticized because a substantial number of them (50%) used reference methods that are not accepted as gold standards. The methodological quality of studies reporting validity for the assessment of MSS was mostly rated inadequate or doubtful (70%). For interday and intraday reliability, 50% of the studies were rated very good or adequate, and 50% were rated doubtful or inadequate. These findings can be explained by the inclusion criteria, as there was no restriction on the type of studies. Hence, studies in which validity or reliability assessment for MSS was not the main objective were included as well. Although they were well designed for their main purpose, the information needed for validity or reliability assessment was not always provided. Additionally, many studies reporting validity used very small sample sizes and hence were rated with a low methodological quality. Another limitation of the included studies is that studies on both MAS and MSS included mostly male and very few female participants. Moreover, studies examining MSS included mostly team sport athletes, potentially because MSS is rather crucial in team sports and sports teams are attractive samples as they usually consist of more than 20 players per team [21].

Practical applications & future research
When performance tests are implemented to monitor individual performance changes or to retrieve values for training prescription, these methods have to be highly valid and reliable [90]; with respect to ASR, this applies to the methods to assess MAS and MSS. Practitioners and researchers should be aware that the results of methods for CPET to assess MAS indicate that the final speed reached during the test or the speed of the 30-s interval of VO2max are associated with a higher input of anaerobic energy supply than the speed at the onset of the plateau of VO2. Therefore, the speed at the onset of this plateau seems to reflect true MAS and should be assessed [10,91]. The results for time trials indicate that distances need to be selected based on the fitness, sex, or type of sport of the participants so that the results for MAS might be valid. We recommend shorter distances (e.g., 1200-1400 m) for subjects with lower fitness, team sport athletes, or sprinters, while longer distances (e.g., 1500-2000 m) are more favorable for endurance-trained athletes to estimate MAS [10,12,33,43,47]. In addition, the results for continuous incremental field tests or shuttle runs [10,14,32,45,48,52] indicate that these tests are not suitable to estimate true MAS, independently of the subjects' fitness.
Besides radar or laser measurement during linear sprinting [17,70,79], timing gates with split times of 5 or 10 m and video analysis with a smartphone application seem promising for determining MSS [62,71,76,79]. GPS measurements during linear sprinting showed poor validity, and the maximal velocities measured during training and matches [15,19,59,62,66] as well as the maximal speed achieved during treadmill sprinting could not reach the actual MSS [15,19,59,60,62,66,72]. However, to differentiate between types of sport, sexes, or fitness levels, future studies should address population-specific validity. From a practical perspective, shorter total distances (20-40 m) are suggested for recreational athletes or types of sport in which shorter sprints are common, e.g., team sports such as soccer or handball, whereas trained sprinters might need distances between 40 and 70 m to reach MSS [57,62-65,70,79,80,86]. It is noteworthy that several influencing factors such as wind, temperature, running surface, or shoes can affect sprint performance and thus potentially MSS. These factors should therefore always be considered when determining sprint performance, especially when assessing changes in intraindividual performance over time [21].
Both the selection of methods and the evaluation of results for ASR should be based on the sport-specific requirements. As different team sports such as soccer or volleyball have different game demands (i.e., shorter sprints and lower total distance covered during a volleyball match compared to soccer), the testing methods might also differ, e.g., shorter time trial distances to assess MAS and shorter sprinting distances to assess MSS for volleyball players compared to soccer players. Future research might focus on these aspects. An overview of the conclusions and practical applications is presented in Fig 2.
With respect to controlling training interventions, an example of high-intensity interval prescriptions with ASR might illustrate the issue of using different methods when assessing the control parameters. When MAS and MSS are retrieved by CPET and radar measurement, respectively, a recreational runner might achieve results of 15 km/h for MAS and 28 km/h for MSS and accordingly an ASR of 13 km/h. However, when for instance the 30-15 IFT is used instead of CPET to set the lower boundary of ASR, the estimated MAS might yield 19 km/h, which reduces ASR to 9 km/h [14]. The prescription of high-intensity intervals at an intensity of, for example, 30% ASR (MAS + 30% ASR) will therefore be noticeably more intense with MAS retrieved by the 30-15 IFT than with CPET (i.e., 18.9 km/h with MAS from CPET and 21.7 km/h with MAS from the 30-15 IFT). Conversely, the intensity would be too low (approx. 18.2 km/h) for the athlete when MSS is assessed, for example, during matches or training using GPS [19] (see Fig 3). As only few studies exist on training interventions using ASR [1,92-94], further research is needed to fill this gap.
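The arithmetic behind this example can be reproduced with a short Python sketch. The function names are hypothetical, and the MAS and MSS values for the first two cases are the illustrative ones given in the paragraph above. The GPS case is an assumption: it uses an MSS of 25 km/h, i.e., the underestimation of approximately 3 km/h mentioned in the caption of Fig 3. This yields 18.0 km/h, close to but not identical with the approx. 18.2 km/h stated above, which presumably rests on a slightly different MSS value.

```python
def anaerobic_speed_reserve(mas_kmh, mss_kmh):
    """ASR is the difference between maximal sprinting and maximal aerobic speed."""
    if mss_kmh <= mas_kmh:
        raise ValueError("MSS must exceed MAS")
    return mss_kmh - mas_kmh


def interval_speed(mas_kmh, mss_kmh, asr_fraction):
    """Target running speed for an interval prescribed as MAS + fraction of ASR."""
    return mas_kmh + asr_fraction * anaerobic_speed_reserve(mas_kmh, mss_kmh)


ASR_FRACTION = 0.30  # 30% ASR, as in the example in the text

# MAS from CPET, MSS from radar (values from the text).
cpet_radar = interval_speed(15.0, 28.0, ASR_FRACTION)

# MAS from the 30-15 IFT, MSS from radar (values from the text).
ift_radar = interval_speed(19.0, 28.0, ASR_FRACTION)

# MAS from CPET, MSS from GPS; 25 km/h is an assumed value (radar minus 3 km/h).
cpet_gps = interval_speed(15.0, 25.0, ASR_FRACTION)

print(f"CPET + radar:      {cpet_radar:.1f} km/h")
print(f"30-15 IFT + radar: {ift_radar:.1f} km/h")
print(f"CPET + GPS:        {cpet_gps:.1f} km/h")
```

The same athlete would thus be prescribed interval speeds that differ by several km/h depending solely on how the two boundary values were measured.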

Conclusions
Beyond assessing single performance parameters, the ASR can provide further insights into an athlete's physiological and neuromuscular profile by considering the individual tolerance to high-intensity exercise. As ASR consists of the parameters MAS and MSS, the methods to assess these parameters need to be valid and reliable.
MAS can be determined during CPET, yet there are different testing protocols without a consensus about the most appropriate one. Due to physiological considerations based on energy supply, the speed at the onset of the VO2 plateau seems to be the most appropriate measure of true MAS. For field tests, studies' results on validity are heterogeneous and do not favor a specific field test (incremental continuous or shuttle runs) for the determination of MAS. However, results on time trials indicate that distances adapted to the subjects' sporting background, fitness, or sex might be suitable to estimate MAS.
Regarding MSS, linear sprints using timing gates or video analysis seem to provide valid and reliable results besides the gold standard method, i.e., radar or laser measurements. The validity of GPS or sprinting on a treadmill cannot be confirmed. Sprinting distances between 20 and 40 m should be selected for recreational subjects or types of sport in which shorter sprints are crucial, e.g., team sports, whereas trained track and field sprinters might need longer total distances (40 to 70 m) to reach MSS. In particular, the use of ASR for prescribing training emphasizes the importance of valid and reliable measurements of MAS and MSS to achieve the optimal and desired intensity. Methods, ideally with a low measurement error and therefore a high reliability, should be maintained throughout the training routine so that changes in ASR can be attributed to changes in individual performance and not to differing results caused by the testing method.

Fig 2. Overview of conclusions for the testing methods for maximal aerobic speed and maximal sprinting speed. The color grading reflects the rating for criterion validity, from green indicating good validity to red indicating poor validity. VO2 Oxygen Uptake, MAS Maximal Aerobic Speed. https://doi.org/10.1371/journal.pone.0296866.g002
of methodological quality based on the boxes 6-9a of the COSMIN checklist [29,30]. (DOCX)
S1 File. Search terms PubMed. (DOCX)

Fig 3. Illustration of values for MAS, MSS and ASR when using different testing methods. Presented values are based on the results of Čović et al. [14] for the MAS data and Djaoui et al. [19] for the MSS data. When GPS is used, the MSS will most likely be underestimated compared to radar (approximately 3 km/h) [17]. MAS will most likely be overestimated when the 30-15 IFT is implemented compared to CPET on a treadmill (approximately 4 km/h) [12]. The ASR will change accordingly. MAS Maximal Aerobic Speed, MSS Maximal Sprinting Speed, ASR Anaerobic Speed Reserve, CPET Cardiopulmonary Exercise Testing, GPS Global Positioning System, 30-15 IFT 30-15 Intermittent Fitness Test. https://doi.org/10.1371/journal.pone.0296866.g003

3.3.1 Maximal aerobic speed.
Reference methods used for validity testing. As a reference method for validity testing, MAS was assessed by CPET on a treadmill in 21 studies (criterion validity), by incremental continuous field tests (UMTT or Vitesse Aerobie Maximale Evaluation (VAM-EVAL)) in six studies, and by the 30-15 Intermittent Fitness Test (30-15 IFT) in one study (convergent validity).

Table 2. (Continued)
The method claimed as the reference method in validity studies is underlined. Criterion validity refers to the comparison with an accepted gold standard method (i.e., CPET on a treadmill for MAS and radar, laser, or motion capture for MSS); convergent validity refers to the comparison with any other method. MQ-Methodological quality (3 = very good; 2 = adequate; 1 = doubtful; 0 = inadequate); MD-Mean difference; ES-Effect size/Cohen's d; ICC-Intraclass Correlation Coefficient; CV-Coefficient of variation; GPS-Global Positioning System; Hz-Hertz; MSS-Maximal Sprinting Speed; LPS-Local Positioning System; NA-Not Available; NMT-Non-Motorized Treadmill. https://doi.org/10.1371/journal.pone.0296866.t002